Overview

Dataset statistics

Number of variables: 20
Number of observations: 3989
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 596.1 KiB
Average record size in memory: 153.0 B

Variable types

Text: 3
Numeric: 4
Categorical: 11
DateTime: 1
Boolean: 1
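
The overview figures can be cross-checked with a little arithmetic. The sketch below assumes that 1 KiB is 1024 bytes and that the average record size is simply total memory divided by the number of observations; the variable names are illustrative and not part of the report.

```python
KIB = 1024  # bytes per kibibyte (assumed unit definition)

total_size_kib = 596.1   # "Total size in memory" from the overview
n_observations = 3989    # "Number of observations" from the overview

# Average record size in bytes, assuming total size / row count.
avg_record_bytes = total_size_kib * KIB / n_observations

# Variable-type counts as listed in the overview.
variable_types = {"Text": 3, "Numeric": 4, "Categorical": 11, "DateTime": 1, "Boolean": 1}
n_variables = sum(variable_types.values())

print(f"average record size: {avg_record_bytes:.1f} B")
print(f"number of variables: {n_variables}")
```

Both results agree with the reported 153.0 B and 20 variables.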

Alerts

card_number is highly overall correlated with customer_segment and 3 other fields (High correlation)
trans_amount is highly overall correlated with fake (High correlation)
customer_segment is highly overall correlated with card_number (High correlation)
card_type is highly overall correlated with card_number (High correlation)
customer_location is highly overall correlated with card_number and 1 other fields (High correlation)
merchant_name is highly overall correlated with fake (High correlation)
trans_loc is highly overall correlated with card_number and 1 other fields (High correlation)
trans_currency is highly overall correlated with fake (High correlation)
fake is highly overall correlated with trans_amount and 2 other fields (High correlation)
trans_currency is highly imbalanced (59.1%) (Imbalance)
trans_id has unique values (Unique)
trans_approval_code has unique values (Unique)

Reproduction

Analysis started: 2023-09-09 00:23:31.840181
Analysis finished: 2023-09-09 00:25:40.509862
Duration: 2 minutes and 8.67 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
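
The reported duration follows directly from the two timestamps. A minimal sketch using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-09-09 00:23:31.840181")
finished = datetime.fromisoformat("2023-09-09 00:25:40.509862")

elapsed = finished - started
minutes, seconds = divmod(elapsed.total_seconds(), 60)

print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
```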

Variables

Distinct: 988
Distinct (%): 24.8%
Missing: 0
Missing (%): 0.0%
Memory size: 31.3 KiB

Length

Max length: 24
Median length: 21
Mean length: 13.177488
Min length: 8

Characters and Unicode

Total characters: 52565
Distinct characters: 53
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
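
As an illustration of this kind of analysis, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The helper name `category_counts` is hypothetical, and the two input strings are taken from the report's sample rows; this is not necessarily how the profiling tool computes its tables.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories over every character in `values`."""
    return Counter(unicodedata.category(ch) for text in values for ch in text)

# Two names from the report's sample rows.
sample = ["Eric Scott", "Jason Jacobs"]
counts = category_counts(sample)

# "Lu" = Uppercase Letter, "Ll" = Lowercase Letter, "Zs" = Space Separator.
for code, n in counts.most_common():
    print(code, n)
```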

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Eric Scott
2nd row: Eric Scott
3rd row: Eric Scott
4th row: Jason Jacobs
5th row: Jason Jacobs
Value               Count  Frequency (%)
smith               105    1.3%
michael             87     1.1%
thomas              81     1.0%
john                71     0.9%
johnson             62     0.8%
david               59     0.7%
jessica             59     0.7%
kevin               58     0.7%
brown               55     0.7%
wilson              54     0.7%
Other values (810)  7440   91.5%
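
A word-frequency table like the one above can be approximated by lowercasing each value and splitting on whitespace. That tokenisation is an assumption made for illustration (the report does not state its exact rule), and `word_counts` is a hypothetical helper.

```python
from collections import Counter

def word_counts(values):
    """Count lowercased, whitespace-separated words across all values."""
    return Counter(word for text in values for word in text.lower().split())

# The five sample rows shown for this variable.
rows = ["Eric Scott", "Eric Scott", "Eric Scott", "Jason Jacobs", "Jason Jacobs"]
counts = word_counts(rows)
total = sum(counts.values())

for word, n in counts.most_common():
    print(f"{word:8s} {n:3d} {n / total:.1%}")
```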

Most occurring characters

Value              Count  Frequency (%)
e                  4932   9.4%
a                  4770   9.1%
(space)            4142   7.9%
n                  3922   7.5%
r                  3795   7.2%
i                  3063   5.8%
o                  2909   5.5%
l                  2489   4.7%
s                  2433   4.6%
t                  2006   3.8%
Other values (43)  18104  34.4%

Most occurring categories

Value              Count  Frequency (%)
Lowercase Letter   40104  76.3%
Uppercase Letter   8253   15.7%
Space Separator    4142   7.9%
Other Punctuation  66     0.1%

Most frequent character per category

Lowercase Letter
Value              Count  Frequency (%)
e                  4932   12.3%
a                  4770   11.9%
n                  3922   9.8%
r                  3795   9.5%
i                  3063   7.6%
o                  2909   7.3%
l                  2489   6.2%
s                  2433   6.1%
t                  2006   5.0%
h                  1808   4.5%
Other values (16)  7977   19.9%

Uppercase Letter
Value              Count  Frequency (%)
M                  908    11.0%
J                  777    9.4%
C                  621    7.5%
S                  609    7.4%
D                  563    6.8%
R                  501    6.1%
B                  481    5.8%
A                  456    5.5%
W                  438    5.3%
H                  418    5.1%
Other values (15)  2481   30.1%

Space Separator
Value              Count  Frequency (%)
(space)            4142   100.0%

Other Punctuation
Value              Count  Frequency (%)
.                  66     100.0%

Most occurring scripts

Value   Count  Frequency (%)
Latin   48357  92.0%
Common  4208   8.0%

Most frequent character per script

Latin
Value              Count  Frequency (%)
e                  4932   10.2%
a                  4770   9.9%
n                  3922   8.1%
r                  3795   7.8%
i                  3063   6.3%
o                  2909   6.0%
l                  2489   5.1%
s                  2433   5.0%
t                  2006   4.1%
h                  1808   3.7%
Other values (41)  16230  33.6%

Common
Value              Count  Frequency (%)
(space)            4142   98.4%
.                  66     1.6%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  52565  100.0%

Most frequent character per block

ASCII
Value              Count  Frequency (%)
e                  4932   9.4%
a                  4770   9.1%
(space)            4142   7.9%
n                  3922   7.5%
r                  3795   7.2%
i                  3063   5.8%
o                  2909   5.5%
l                  2489   4.7%
s                  2433   4.6%
t                  2006   3.8%
Other values (43)  18104  34.4%

card_number
Real number (ℝ)

HIGH CORRELATION 

Distinct: 1000
Distinct (%): 25.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.900998 × 10^11
Minimum: 8.9289214 × 10^8
Maximum: 9.9906721 × 10^11
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 31.3 KiB

Quantile statistics

Minimum: 8.9289214 × 10^8
5-th percentile: 4.1944075 × 10^10
Q1: 2.4409851 × 10^11
Median: 5.0215155 × 10^11
Q3: 7.418035 × 10^11
95-th percentile: 9.5200944 × 10^11
Maximum: 9.9906721 × 10^11
Range: 9.9817432 × 10^11
Interquartile range (IQR): 4.9770499 × 10^11

Descriptive statistics

Standard deviation: 2.9035767 × 10^11
Coefficient of variation (CV): 0.59244602
Kurtosis: -1.1794322
Mean: 4.900998 × 10^11
Median Absolute Deviation (MAD): 2.4521463 × 10^11
Skewness: 0.0084910101
Sum: 1.9550081 × 10^15
Variance: 8.4307578 × 10^22
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
2.855854116 × 10^11  6      0.2%
6.35897845 × 10^11   6      0.2%
6.436706811 × 10^10  6      0.2%
5.983606543 × 10^11  6      0.2%
1.372442549 × 10^11  6      0.2%
7.406717455 × 10^10  6      0.2%
9.292827794 × 10^11  6      0.2%
3.606326657 × 10^11  6      0.2%
1.496986148 × 10^11  6      0.2%
7.163525352 × 10^10  6      0.2%
Other values (990)   3929   98.5%

Value                Count  Frequency (%)
892892140            6      0.2%
1582283666           2      0.1%
2265677629           4      0.1%
2603340564           4      0.1%
5753444808           6      0.2%
6187905742           3      0.1%
7026605966           2      0.1%
7828017180           4      0.1%
8234714276           4      0.1%
1.03752441 × 10^10   4      0.1%

Value                Count  Frequency (%)
9.990672138 × 10^11  5      0.1%
9.984902805 × 10^11  5      0.1%
9.963276358 × 10^11  3      0.1%
9.950976328 × 10^11  2      0.1%
9.942390924 × 10^11  5      0.1%
9.935287527 × 10^11  4      0.1%
9.932618476 × 10^11  4      0.1%
9.922829315 × 10^11  6      0.2%
9.894241454 × 10^11  2      0.1%
9.891618558 × 10^11  5      0.1%

customer_age
Real number (ℝ)

Distinct: 49
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 41.488594
Minimum: 18
Maximum: 69
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 31.3 KiB

Quantile statistics

Minimum: 18
5-th percentile: 20
Q1: 28
Median: 40
Q3: 51
95-th percentile: 68
Maximum: 69
Range: 51
Interquartile range (IQR): 23

Descriptive statistics

Standard deviation: 14.67239
Coefficient of variation (CV): 0.35364876
Kurtosis: -0.98914252
Mean: 41.488594
Median Absolute Deviation (MAD): 12
Skewness: 0.28548129
Sum: 165498
Variance: 215.27902
Monotonicity: Not monotonic
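
Several of these statistics are derived from one another. The sketch below recomputes them from the reported customer_age figures, assuming the usual definitions (CV = standard deviation / mean, variance = standard deviation squared, sum = mean × count, IQR = Q3 − Q1, range = maximum − minimum). Small differences arise because the inputs are already rounded.

```python
n = 3989                      # number of observations
mean = 41.488594
std = 14.67239
q1, q3 = 28, 51
minimum, maximum = 18, 69

cv = std / mean               # coefficient of variation
variance = std ** 2
total = mean * n              # sum of all values
iqr = q3 - q1
value_range = maximum - minimum

print(f"CV={cv:.6f} variance={variance:.3f} sum={total:.0f} "
      f"IQR={iqr} range={value_range}")
```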
Histogram with fixed size bins (bins=49)
Value              Count  Frequency (%)
39                 197    4.9%
45                 194    4.9%
43                 177    4.4%
37                 162    4.1%
22                 156    3.9%
24                 145    3.6%
20                 144    3.6%
25                 143    3.6%
32                 137    3.4%
55                 123    3.1%
Other values (39)  2411   60.4%

Value  Count  Frequency (%)
18     29     0.7%
19     64     1.6%
20     144    3.6%
21     22     0.6%
22     156    3.9%
23     37     0.9%
24     145    3.6%
25     143    3.6%
26     85     2.1%
27     120    3.0%

Value  Count  Frequency (%)
69     95     2.4%
68     120    3.0%
67     57     1.4%
66     33     0.8%
65     85     2.1%
64     108    2.7%
63     9      0.2%
62     46     1.2%
61     7      0.2%
60     77     1.9%

customer_segment
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 31.3 KiB
Student: 847
Retail: 809
Premium: 780
Business: 778
Other: 775

Length

Max length8
Median length7
Mean length6.6036601
Min length5

Characters and Unicode

Total characters26342
Distinct characters17
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowRetail
2nd rowRetail
3rd rowRetail
4th rowStudent
5th rowStudent

Common Values

Value     Count  Frequency (%)
Student   847    21.2%
Retail    809    20.3%
Premium   780    19.6%
Business  778    19.5%
Other     775    19.4%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
student 847
21.2%
retail 809
20.3%
premium 780
19.6%
business 778
19.5%
other 775
19.4%

Most occurring characters

ValueCountFrequency (%)
e 3989
15.1%
t 3278
12.4%
u 2405
9.1%
i 2367
9.0%
s 2334
8.9%
n 1625
 
6.2%
m 1560
 
5.9%
r 1555
 
5.9%
S 847
 
3.2%
d 847
 
3.2%
Other values (7) 5535
21.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 22353
84.9%
Uppercase Letter 3989
 
15.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 3989
17.8%
t 3278
14.7%
u 2405
10.8%
i 2367
10.6%
s 2334
10.4%
n 1625
7.3%
m 1560
 
7.0%
r 1555
 
7.0%
d 847
 
3.8%
l 809
 
3.6%
Other values (2) 1584
 
7.1%
Uppercase Letter
ValueCountFrequency (%)
S 847
21.2%
R 809
20.3%
P 780
19.6%
B 778
19.5%
O 775
19.4%

Most occurring scripts

ValueCountFrequency (%)
Latin 26342
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 3989
15.1%
t 3278
12.4%
u 2405
9.1%
i 2367
9.0%
s 2334
8.9%
n 1625
 
6.2%
m 1560
 
5.9%
r 1555
 
5.9%
S 847
 
3.2%
d 847
 
3.2%
Other values (7) 5535
21.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 26342
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 3989
15.1%
t 3278
12.4%
u 2405
9.1%
i 2367
9.0%
s 2334
8.9%
n 1625
 
6.2%
m 1560
 
5.9%
r 1555
 
5.9%
S 847
 
3.2%
d 847
 
3.2%
Other values (7) 5535
21.0%

card_type
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 31.3 KiB
Visa: 865
Rupay: 847
Other: 768
American Express: 760
MasterCard: 749

Length

Max length16
Median length10
Mean length7.8177488
Min length4

Characters and Unicode

Total characters31185
Distinct characters23
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowOther
2nd rowOther
3rd rowOther
4th rowVisa
5th rowVisa

Common Values

Value             Count  Frequency (%)
Visa              865    21.7%
Rupay             847    21.2%
Other             768    19.3%
American Express  760    19.1%
MasterCard        749    18.8%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
visa 865
18.2%
rupay 847
17.8%
other 768
16.2%
american 760
16.0%
express 760
16.0%
mastercard 749
15.8%

Most occurring characters

ValueCountFrequency (%)
a 3970
 
12.7%
r 3786
 
12.1%
s 3134
 
10.0%
e 3037
 
9.7%
i 1625
 
5.2%
p 1607
 
5.2%
t 1517
 
4.9%
V 865
 
2.8%
y 847
 
2.7%
u 847
 
2.7%
Other values (13) 9950
31.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 24927
79.9%
Uppercase Letter 5498
 
17.6%
Space Separator 760
 
2.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 3970
15.9%
r 3786
15.2%
s 3134
12.6%
e 3037
12.2%
i 1625
6.5%
p 1607
6.4%
t 1517
 
6.1%
y 847
 
3.4%
u 847
 
3.4%
h 768
 
3.1%
Other values (5) 3789
15.2%
Uppercase Letter
ValueCountFrequency (%)
V 865
15.7%
R 847
15.4%
O 768
14.0%
A 760
13.8%
E 760
13.8%
M 749
13.6%
C 749
13.6%
Space Separator
ValueCountFrequency (%)
760
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 30425
97.6%
Common 760
 
2.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 3970
13.0%
r 3786
12.4%
s 3134
 
10.3%
e 3037
 
10.0%
i 1625
 
5.3%
p 1607
 
5.3%
t 1517
 
5.0%
V 865
 
2.8%
y 847
 
2.8%
u 847
 
2.8%
Other values (12) 9190
30.2%
Common
ValueCountFrequency (%)
760
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 31185
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 3970
 
12.7%
r 3786
 
12.1%
s 3134
 
10.0%
e 3037
 
9.7%
i 1625
 
5.2%
p 1607
 
5.2%
t 1517
 
4.9%
V 865
 
2.8%
y 847
 
2.7%
u 847
 
2.7%
Other values (13) 9950
31.9%

customer_location
Categorical

HIGH CORRELATION 

Distinct: 10
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 31.3 KiB
Delhi: 447
Lucknow: 447
Mumbai: 413
Ahmedabad: 410
Pune: 406
Other values (5): 1866

Length

Max length9
Median length7
Mean length6.8540988
Min length4

Characters and Unicode

Total characters27341
Distinct characters29
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowKolkata
2nd rowKolkata
3rd rowKolkata
4th rowDelhi
5th rowDelhi

Common Values

Value      Count  Frequency (%)
Delhi      447    11.2%
Lucknow    447    11.2%
Mumbai     413    10.4%
Ahmedabad  410    10.3%
Pune       406    10.2%
Kolkata    385    9.7%
Bangalore  378    9.5%
Jaipur     377    9.5%
Hyderabad  372    9.3%
Chennai    354    8.9%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
delhi 447
11.2%
lucknow 447
11.2%
mumbai 413
10.4%
ahmedabad 410
10.3%
pune 406
10.2%
kolkata 385
9.7%
bangalore 378
9.5%
jaipur 377
9.5%
hyderabad 372
9.3%
chennai 354
8.9%

Most occurring characters

ValueCountFrequency (%)
a 4234
15.5%
e 2367
 
8.7%
n 1939
 
7.1%
u 1643
 
6.0%
i 1591
 
5.8%
d 1564
 
5.7%
h 1211
 
4.4%
l 1210
 
4.4%
o 1210
 
4.4%
b 1195
 
4.4%
Other values (19) 9177
33.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 23352
85.4%
Uppercase Letter 3989
 
14.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 4234
18.1%
e 2367
10.1%
n 1939
 
8.3%
u 1643
 
7.0%
i 1591
 
6.8%
d 1564
 
6.7%
h 1211
 
5.2%
l 1210
 
5.2%
o 1210
 
5.2%
b 1195
 
5.1%
Other values (9) 5188
22.2%
Uppercase Letter
ValueCountFrequency (%)
D 447
11.2%
L 447
11.2%
M 413
10.4%
A 410
10.3%
P 406
10.2%
K 385
9.7%
B 378
9.5%
J 377
9.5%
H 372
9.3%
C 354
8.9%

Most occurring scripts

ValueCountFrequency (%)
Latin 27341
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 4234
15.5%
e 2367
 
8.7%
n 1939
 
7.1%
u 1643
 
6.0%
i 1591
 
5.8%
d 1564
 
5.7%
h 1211
 
4.4%
l 1210
 
4.4%
o 1210
 
4.4%
b 1195
 
4.4%
Other values (19) 9177
33.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 27341
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 4234
15.5%
e 2367
 
8.7%
n 1939
 
7.1%
u 1643
 
6.0%
i 1591
 
5.8%
d 1564
 
5.7%
h 1211
 
4.4%
l 1210
 
4.4%
o 1210
 
4.4%
b 1195
 
4.4%
Other values (19) 9177
33.6%

merchant_name
Categorical

HIGH CORRELATION 

Distinct: 18
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 31.3 KiB
amazon gift cards: 419
instamart: 418
airtel: 417
swiggy: 408
amazon: 405
Other values (13): 1922

Length

Max length17
Median length15
Mean length9.6269742
Min length6

Characters and Unicode

Total characters38402
Distinct characters32
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowairtel
2nd rowairtel
3rd rowfake_merchant_8
4th rowrakuten
5th rowswiggy

Common Values

Value              Count  Frequency (%)
amazon gift cards  419    10.5%
instamart          418    10.5%
airtel             417    10.5%
swiggy             408    10.2%
amazon             405    10.2%
rakuten            402    10.1%
zomato             393    9.9%
chai talks         388    9.7%
fake_merchant_0    87     2.2%
fake_merchant_4    82     2.1%
Other values (8)   570    14.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
amazon 824
15.8%
gift 419
8.0%
cards 419
8.0%
instamart 418
8.0%
airtel 417
8.0%
swiggy 408
7.8%
rakuten 402
7.7%
zomato 393
7.5%
chai 388
7.4%
talks 388
7.4%
Other values (10) 739
14.2%

Most occurring characters

ValueCountFrequency (%)
a 6369
16.6%
t 3594
 
9.4%
r 2395
 
6.2%
n 2383
 
6.2%
m 2374
 
6.2%
e 2297
 
6.0%
i 2050
 
5.3%
s 1633
 
4.3%
o 1610
 
4.2%
c 1546
 
4.0%
Other values (22) 12151
31.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 34959
91.0%
Connector Punctuation 1478
 
3.8%
Space Separator 1226
 
3.2%
Decimal Number 739
 
1.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 6369
18.2%
t 3594
10.3%
r 2395
 
6.9%
n 2383
 
6.8%
m 2374
 
6.8%
e 2297
 
6.6%
i 2050
 
5.9%
s 1633
 
4.7%
o 1610
 
4.6%
c 1546
 
4.4%
Other values (10) 8708
24.9%
Decimal Number
ValueCountFrequency (%)
0 87
11.8%
4 82
11.1%
2 80
10.8%
1 77
10.4%
9 75
10.1%
7 72
9.7%
5 72
9.7%
8 70
9.5%
6 65
8.8%
3 59
8.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1478
100.0%
Space Separator
ValueCountFrequency (%)
1226
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 34959
91.0%
Common 3443
 
9.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 6369
18.2%
t 3594
10.3%
r 2395
 
6.9%
n 2383
 
6.8%
m 2374
 
6.8%
e 2297
 
6.6%
i 2050
 
5.9%
s 1633
 
4.7%
o 1610
 
4.6%
c 1546
 
4.4%
Other values (10) 8708
24.9%
Common
ValueCountFrequency (%)
_ 1478
42.9%
1226
35.6%
0 87
 
2.5%
4 82
 
2.4%
2 80
 
2.3%
1 77
 
2.2%
9 75
 
2.2%
7 72
 
2.1%
5 72
 
2.1%
8 70
 
2.0%
Other values (2) 124
 
3.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 38402
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 6369
16.6%
t 3594
 
9.4%
r 2395
 
6.2%
n 2383
 
6.2%
m 2374
 
6.2%
e 2297
 
6.0%
i 2050
 
5.3%
s 1633
 
4.3%
o 1610
 
4.2%
c 1546
 
4.0%
Other values (22) 12151
31.6%

Distinct: 20
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 31.3 KiB
Lucknow: 228
New York City, USA: 224
Hyderabad: 219
Tokyo, Japan: 215
Singapore, Singapore: 215
Other values (15): 2888

Length

Max length23
Median length17
Mean length11.666082
Min length4

Characters and Unicode

Total characters46536
Distinct characters42
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowTokyo, Japan
2nd rowCape Town, South Africa
3rd rowNew York City, USA
4th rowChennai
5th rowChennai

Common Values

Value                    Count  Frequency (%)
Lucknow                  228    5.7%
New York City, USA       224    5.6%
Hyderabad                219    5.5%
Tokyo, Japan             215    5.4%
Singapore, Singapore     215    5.4%
Bangalore                215    5.4%
Cape Town, South Africa  213    5.3%
Rio de Janeiro, Brazil   211    5.3%
Toronto, Canada          207    5.2%
Paris, France            204    5.1%
Other values (10)        1838   46.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
singapore 430
 
5.9%
lucknow 228
 
3.1%
york 224
 
3.1%
city 224
 
3.1%
usa 224
 
3.1%
new 224
 
3.1%
hyderabad 219
 
3.0%
tokyo 215
 
2.9%
japan 215
 
2.9%
bangalore 215
 
2.9%
Other values (25) 4915
67.0%

Most occurring characters

ValueCountFrequency (%)
a 5410
 
11.6%
o 3536
 
7.6%
3344
 
7.2%
n 3238
 
7.0%
e 2840
 
6.1%
i 2805
 
6.0%
r 2705
 
5.8%
, 2048
 
4.4%
d 1613
 
3.5%
u 1378
 
3.0%
Other values (32) 17619
37.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 32986
70.9%
Uppercase Letter 8158
 
17.5%
Space Separator 3344
 
7.2%
Other Punctuation 2048
 
4.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 5410
16.4%
o 3536
10.7%
n 3238
9.8%
e 2840
 
8.6%
i 2805
 
8.5%
r 2705
 
8.2%
d 1613
 
4.9%
u 1378
 
4.2%
p 1056
 
3.2%
y 996
 
3.0%
Other values (12) 7409
22.5%
Uppercase Letter
ValueCountFrequency (%)
S 1036
12.7%
A 1002
12.3%
C 826
10.1%
T 635
 
7.8%
J 624
 
7.6%
U 614
 
7.5%
B 426
 
5.2%
L 420
 
5.1%
P 395
 
4.8%
D 371
 
4.5%
Other values (8) 1809
22.2%
Space Separator
ValueCountFrequency (%)
3344
100.0%
Other Punctuation
ValueCountFrequency (%)
, 2048
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 41144
88.4%
Common 5392
 
11.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 5410
 
13.1%
o 3536
 
8.6%
n 3238
 
7.9%
e 2840
 
6.9%
i 2805
 
6.8%
r 2705
 
6.6%
d 1613
 
3.9%
u 1378
 
3.3%
p 1056
 
2.6%
S 1036
 
2.5%
Other values (30) 15527
37.7%
Common
ValueCountFrequency (%)
3344
62.0%
, 2048
38.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 46536
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 5410
 
11.6%
o 3536
 
7.6%
3344
 
7.2%
n 3238
 
7.0%
e 2840
 
6.1%
i 2805
 
6.0%
r 2705
 
5.8%
, 2048
 
4.4%
d 1613
 
3.5%
u 1378
 
3.0%
Other values (32) 17619
37.9%

trans_id
Text

UNIQUE 

Distinct: 3989
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 31.3 KiB

Length

Max length18
Median length18
Mean length18
Min length18

Characters and Unicode

Total characters71802
Distinct characters62
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique3989 ?
Unique (%)100.0%

Sample

1st rowQc9M36mAY1s6rxld9M
2nd rowHlrx3ReSC0IJRYzBy5
3rd rowqScFTTzOUZD3M0rcg5
4th row0cOkB7x1L6DB5WzKvu
5th rowO7jfbyedchAWUScw30
ValueCountFrequency (%)
qc9m36may1s6rxld9m 1
 
< 0.1%
orrmctmsyrgfzyv0hn 1
 
< 0.1%
i6lhsbbk0g3xk5m2eu 1
 
< 0.1%
qscfttzouzd3m0rcg5 1
 
< 0.1%
0cokb7x1l6db5wzkvu 1
 
< 0.1%
o7jfbyedchawuscw30 1
 
< 0.1%
cg6y9khge0enh3a6az 1
 
< 0.1%
ntqm2jxqsimmysqsp1 1
 
< 0.1%
tetj62xq7zjwett6ia 1
 
< 0.1%
j3xnniyk9ww7x0zdn1 1
 
< 0.1%
Other values (3979) 3979
99.7%

Most occurring characters

ValueCountFrequency (%)
l 1227
 
1.7%
4 1218
 
1.7%
O 1214
 
1.7%
6 1214
 
1.7%
S 1203
 
1.7%
8 1199
 
1.7%
o 1198
 
1.7%
Q 1196
 
1.7%
d 1195
 
1.7%
2 1192
 
1.7%
Other values (52) 59746
83.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 30092
41.9%
Uppercase Letter 30000
41.8%
Decimal Number 11710
 
16.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l 1227
 
4.1%
o 1198
 
4.0%
d 1195
 
4.0%
m 1188
 
3.9%
h 1183
 
3.9%
u 1181
 
3.9%
k 1178
 
3.9%
j 1174
 
3.9%
v 1172
 
3.9%
f 1172
 
3.9%
Other values (16) 18224
60.6%
Uppercase Letter
ValueCountFrequency (%)
O 1214
 
4.0%
S 1203
 
4.0%
Q 1196
 
4.0%
K 1180
 
3.9%
W 1179
 
3.9%
M 1175
 
3.9%
E 1171
 
3.9%
A 1171
 
3.9%
B 1170
 
3.9%
I 1168
 
3.9%
Other values (16) 18173
60.6%
Decimal Number
ValueCountFrequency (%)
4 1218
10.4%
6 1214
10.4%
8 1199
10.2%
2 1192
10.2%
0 1169
10.0%
9 1163
9.9%
1 1162
9.9%
3 1160
9.9%
5 1136
9.7%
7 1097
9.4%

Most occurring scripts

ValueCountFrequency (%)
Latin 60092
83.7%
Common 11710
 
16.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
l 1227
 
2.0%
O 1214
 
2.0%
S 1203
 
2.0%
o 1198
 
2.0%
Q 1196
 
2.0%
d 1195
 
2.0%
m 1188
 
2.0%
h 1183
 
2.0%
u 1181
 
2.0%
K 1180
 
2.0%
Other values (42) 48127
80.1%
Common
ValueCountFrequency (%)
4 1218
10.4%
6 1214
10.4%
8 1199
10.2%
2 1192
10.2%
0 1169
10.0%
9 1163
9.9%
1 1162
9.9%
3 1160
9.9%
5 1136
9.7%
7 1097
9.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 71802
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
l 1227
 
1.7%
4 1218
 
1.7%
O 1214
 
1.7%
6 1214
 
1.7%
S 1203
 
1.7%
8 1199
 
1.7%
o 1198
 
1.7%
Q 1196
 
1.7%
d 1195
 
1.7%
2 1192
 
1.7%
Other values (52) 59746
83.2%

trans_approval_code
Text

UNIQUE 

Distinct: 3989
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 31.3 KiB

Length

Max length6
Median length6
Mean length6
Min length6

Characters and Unicode

Total characters23934
Distinct characters36
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique3989 ?
Unique (%)100.0%

Sample

1st rowL6TGCC
2nd rowXTIUMZ
3rd rowEX8G40
4th rowPWKEUM
5th rowT70W7M
ValueCountFrequency (%)
l6tgcc 1
 
< 0.1%
ajipac 1
 
< 0.1%
ltrw5f 1
 
< 0.1%
ex8g40 1
 
< 0.1%
pwkeum 1
 
< 0.1%
t70w7m 1
 
< 0.1%
4p6ao8 1
 
< 0.1%
5mwotx 1
 
< 0.1%
2m3ybc 1
 
< 0.1%
3rvjp4 1
 
< 0.1%
Other values (3979) 3979
99.7%

Most occurring characters

ValueCountFrequency (%)
N 724
 
3.0%
J 716
 
3.0%
7 713
 
3.0%
1 700
 
2.9%
B 699
 
2.9%
9 689
 
2.9%
6 683
 
2.9%
D 680
 
2.8%
S 680
 
2.8%
O 678
 
2.8%
Other values (26) 16972
70.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 17201
71.9%
Decimal Number 6733
 
28.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 724
 
4.2%
J 716
 
4.2%
B 699
 
4.1%
D 680
 
4.0%
S 680
 
4.0%
O 678
 
3.9%
E 677
 
3.9%
G 674
 
3.9%
T 673
 
3.9%
P 671
 
3.9%
Other values (16) 10329
60.0%
Decimal Number
ValueCountFrequency (%)
7 713
10.6%
1 700
10.4%
9 689
10.2%
6 683
10.1%
2 674
10.0%
5 669
9.9%
3 661
9.8%
8 657
9.8%
0 645
9.6%
4 642
9.5%

Most occurring scripts

ValueCountFrequency (%)
Latin 17201
71.9%
Common 6733
 
28.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 724
 
4.2%
J 716
 
4.2%
B 699
 
4.1%
D 680
 
4.0%
S 680
 
4.0%
O 678
 
3.9%
E 677
 
3.9%
G 674
 
3.9%
T 673
 
3.9%
P 671
 
3.9%
Other values (16) 10329
60.0%
Common
ValueCountFrequency (%)
7 713
10.6%
1 700
10.4%
9 689
10.2%
6 683
10.1%
2 674
10.0%
5 669
9.9%
3 661
9.8%
8 657
9.8%
0 645
9.6%
4 642
9.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 23934
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 724
 
3.0%
J 716
 
3.0%
7 713
 
3.0%
1 700
 
2.9%
B 699
 
2.9%
9 689
 
2.9%
6 683
 
2.9%
D 680
 
2.8%
S 680
 
2.8%
O 678
 
2.8%
Other values (26) 16972
70.9%

trans_loc
Categorical

HIGH CORRELATION 

Distinct: 10
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 31.3 KiB
Delhi: 447
Lucknow: 447
Mumbai: 413
Ahmedabad: 410
Pune: 406
Other values (5): 1866

Length

Max length9
Median length7
Mean length6.8540988
Min length4

Characters and Unicode

Total characters27341
Distinct characters29
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowKolkata
2nd rowKolkata
3rd rowKolkata
4th rowDelhi
5th rowDelhi

Common Values

Value      Count  Frequency (%)
Delhi      447    11.2%
Lucknow    447    11.2%
Mumbai     413    10.4%
Ahmedabad  410    10.3%
Pune       406    10.2%
Kolkata    385    9.7%
Bangalore  378    9.5%
Jaipur     377    9.5%
Hyderabad  372    9.3%
Chennai    354    8.9%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
delhi 447
11.2%
lucknow 447
11.2%
mumbai 413
10.4%
ahmedabad 410
10.3%
pune 406
10.2%
kolkata 385
9.7%
bangalore 378
9.5%
jaipur 377
9.5%
hyderabad 372
9.3%
chennai 354
8.9%

Most occurring characters

ValueCountFrequency (%)
a 4234
15.5%
e 2367
 
8.7%
n 1939
 
7.1%
u 1643
 
6.0%
i 1591
 
5.8%
d 1564
 
5.7%
h 1211
 
4.4%
l 1210
 
4.4%
o 1210
 
4.4%
b 1195
 
4.4%
Other values (19) 9177
33.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 23352
85.4%
Uppercase Letter 3989
 
14.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 4234
18.1%
e 2367
10.1%
n 1939
 
8.3%
u 1643
 
7.0%
i 1591
 
6.8%
d 1564
 
6.7%
h 1211
 
5.2%
l 1210
 
5.2%
o 1210
 
5.2%
b 1195
 
5.1%
Other values (9) 5188
22.2%
Uppercase Letter
ValueCountFrequency (%)
D 447
11.2%
L 447
11.2%
M 413
10.4%
A 410
10.3%
P 406
10.2%
K 385
9.7%
B 378
9.5%
J 377
9.5%
H 372
9.3%
C 354
8.9%

Most occurring scripts

ValueCountFrequency (%)
Latin 27341
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 4234
15.5%
e 2367
 
8.7%
n 1939
 
7.1%
u 1643
 
6.0%
i 1591
 
5.8%
d 1564
 
5.7%
h 1211
 
4.4%
l 1210
 
4.4%
o 1210
 
4.4%
b 1195
 
4.4%
Other values (19) 9177
33.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 27341
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 4234
15.5%
e 2367
 
8.7%
n 1939
 
7.1%
u 1643
 
6.0%
i 1591
 
5.8%
d 1564
 
5.7%
h 1211
 
4.4%
l 1210
 
4.4%
o 1210
 
4.4%
b 1195
 
4.4%
Other values (19) 9177
33.6%

trans_cat
Categorical

Distinct: 7
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 31.3 KiB
Other: 606
Entertainment: 583
Grocery: 580
Dining: 574
Retail: 573
Other values (2): 1073

Length

Max length13
Median length9
Mean length7.4151416
Min length5

Characters and Unicode

Total characters29579
Distinct characters22
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowTravel
2nd rowEntertainment
3rd rowDining
4th rowRetail
5th rowGrocery

Common Values

Value          Count  Frequency (%)
Other          606    15.2%
Entertainment  583    14.6%
Grocery        580    14.5%
Dining         574    14.4%
Retail         573    14.4%
Travel         543    13.6%
Utilities      530    13.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
other 606
15.2%
entertainment 583
14.6%
grocery 580
14.5%
dining 574
14.4%
retail 573
14.4%
travel 543
13.6%
utilities 530
13.3%

Most occurring characters

ValueCountFrequency (%)
e 3998
13.5%
t 3988
13.5%
i 3894
13.2%
n 2897
9.8%
r 2892
9.8%
a 1699
 
5.7%
l 1646
 
5.6%
O 606
 
2.0%
h 606
 
2.0%
E 583
 
2.0%
Other values (12) 6770
22.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 25590
86.5%
Uppercase Letter 3989
 
13.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 3998
15.6%
t 3988
15.6%
i 3894
15.2%
n 2897
11.3%
r 2892
11.3%
a 1699
6.6%
l 1646
6.4%
h 606
 
2.4%
m 583
 
2.3%
y 580
 
2.3%
Other values (5) 2807
11.0%
Uppercase Letter
ValueCountFrequency (%)
O 606
15.2%
E 583
14.6%
G 580
14.5%
D 574
14.4%
R 573
14.4%
T 543
13.6%
U 530
13.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 29579
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 3998
13.5%
t 3988
13.5%
i 3894
13.2%
n 2897
9.8%
r 2892
9.8%
a 1699
 
5.7%
l 1646
 
5.6%
O 606
 
2.0%
h 606
 
2.0%
E 583
 
2.0%
Other values (12) 6770
22.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 29579
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 3998
13.5%
t 3988
13.5%
i 3894
13.2%
n 2897
9.8%
r 2892
9.8%
a 1699
 
5.7%
l 1646
 
5.6%
O 606
 
2.0%
h 606
 
2.0%
E 583
 
2.0%
Other values (12) 6770
22.9%

trans_currency
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 7
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 31.3 KiB
INR: 3268
USD: 132
Other: 123
JPY: 121
CAD: 116
Other values (2): 229

Length

Max length5
Median length3
Mean length3.0616696
Min length3

Characters and Unicode

Total characters12213
Distinct characters17
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowINR
2nd rowINR
3rd rowCAD
4th rowINR
5th rowINR

Common Values

Value  Count  Frequency (%)
INR    3268   81.9%
USD    132    3.3%
Other  123    3.1%
JPY    121    3.0%
CAD    116    2.9%
EUR    116    2.9%
AUD    113    2.8%
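
The IMBALANCE flag on trans_currency reports a score of 59.1%. One formula that reproduces this figure from the counts above is one minus the normalised Shannon entropy of the class distribution. Treat it as an assumption about how the score is derived rather than a statement of the tool's implementation; `imbalance_score` is a hypothetical helper.

```python
import math

# Class counts copied from the Common Values table for trans_currency.
counts = {"INR": 3268, "USD": 132, "Other": 123, "JPY": 121,
          "CAD": 116, "EUR": 116, "AUD": 113}

def imbalance_score(class_counts):
    """Return 1 - H/H_max, where H is the Shannon entropy in bits.

    0.0 means perfectly balanced classes; values near 1.0 mean one class dominates.
    """
    class_counts = [c for c in class_counts if c > 0]
    total = sum(class_counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in class_counts)
    max_entropy = math.log2(len(class_counts))
    return 1.0 - entropy / max_entropy if max_entropy > 0 else 0.0

score = imbalance_score(counts.values())
print(f"{score:.1%}")
```

With seven classes and INR accounting for about 82% of rows, the entropy is well below its maximum of log2(7), which is what drives the score up.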

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
inr 3268
81.9%
usd 132
 
3.3%
other 123
 
3.1%
jpy 121
 
3.0%
cad 116
 
2.9%
eur 116
 
2.9%
aud 113
 
2.8%

Most occurring characters

ValueCountFrequency (%)
R 3384
27.7%
I 3268
26.8%
N 3268
26.8%
U 361
 
3.0%
D 361
 
3.0%
A 229
 
1.9%
S 132
 
1.1%
e 123
 
1.0%
r 123
 
1.0%
h 123
 
1.0%
Other values (7) 841
 
6.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 11721
96.0%
Lowercase Letter 492
 
4.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
R 3384
28.9%
I 3268
27.9%
N 3268
27.9%
U 361
 
3.1%
D 361
 
3.1%
A 229
 
2.0%
S 132
 
1.1%
O 123
 
1.0%
J 121
 
1.0%
P 121
 
1.0%
Other values (3) 353
 
3.0%
Lowercase Letter
ValueCountFrequency (%)
e 123
25.0%
r 123
25.0%
h 123
25.0%
t 123
25.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 12213
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
R 3384
27.7%
I 3268
26.8%
N 3268
26.8%
U 361
 
3.0%
D 361
 
3.0%
A 229
 
1.9%
S 132
 
1.1%
e 123
 
1.0%
r 123
 
1.0%
h 123
 
1.0%
Other values (7) 841
 
6.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12213
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
R 3384
27.7%
I 3268
26.8%
N 3268
26.8%
U 361
 
3.0%
D 361
 
3.0%
A 229
 
1.9%
S 132
 
1.1%
e 123
 
1.0%
r 123
 
1.0%
h 123
 
1.0%
Other values (7) 841
 
6.9%

mcc
Real number (ℝ)

Distinct 20
Distinct (%) 0.5%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 4008.6007
Minimum 4000
Maximum 4019
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 31.3 KiB

Quantile statistics

Minimum 4000
5-th percentile 4000
Q1 4004
median 4008
Q3 4013
95-th percentile 4018
Maximum 4019
Range 19
Interquartile range (IQR) 9

Descriptive statistics

Standard deviation 5.6662181
Coefficient of variation (CV) 0.0014135152
Kurtosis -1.1813784
Mean 4008.6007
Median Absolute Deviation (MAD) 5
Skewness 0.16950742
Sum 15990308
Variance 32.106028
Monotonicity Not monotonic
Histogram with fixed size bins (bins=20)
Common Values
Value Count Frequency (%)
4004 322 8.1%
4003 316 7.9%
4017 313 7.8%
4008 303 7.6%
4018 277 6.9%
4000 272 6.8%
4001 245 6.1%
4013 234 5.9%
4011 233 5.8%
4006 211 5.3%
Other values (10) 1263 31.7%

Minimum 10 values
Value Count Frequency (%)
4000 272 6.8%
4001 245 6.1%
4002 107 2.7%
4003 316 7.9%
4004 322 8.1%
4005 211 5.3%
4006 211 5.3%
4007 69 1.7%
4008 303 7.6%
4009 176 4.4%

Maximum 10 values
Value Count Frequency (%)
4019 3 0.1%
4018 277 6.9%
4017 313 7.8%
4016 153 3.8%
4015 22 0.6%
4014 106 2.7%
4013 234 5.9%
4012 205 5.1%
4011 233 5.8%
4010 211 5.3%
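
Because mcc takes only 20 distinct values, and the ascending and descending tables above together list all of them, the summary statistics can be checked directly from the counts. This is a minimal sketch that assumes the reported variance uses the sample (n - 1) denominator; the variable names are illustrative.

```python
import math

# Value -> count, merged from the ascending and descending tables for mcc.
mcc_counts = {
    4000: 272, 4001: 245, 4002: 107, 4003: 316, 4004: 322,
    4005: 211, 4006: 211, 4007: 69, 4008: 303, 4009: 176,
    4010: 211, 4011: 233, 4012: 205, 4013: 234, 4014: 106,
    4015: 22, 4016: 153, 4017: 313, 4018: 277, 4019: 3,
}

n = sum(mcc_counts.values())
total = sum(value * count for value, count in mcc_counts.items())
mean = total / n

# Sample variance (n - 1 denominator); assumed to match the report's convention.
squared_dev = sum(count * (value - mean) ** 2 for value, count in mcc_counts.items())
variance = squared_dev / (n - 1)
std = math.sqrt(variance)
cv = std / mean  # coefficient of variation

print(n, total)
print(round(mean, 4), round(variance, 4), round(std, 4), round(cv, 7))
```

With these counts the number of observations, sum, mean, variance, standard deviation, and CV agree with the figures listed under Descriptive statistics, which supports the n - 1 assumption.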

trans_amount
Real number (ℝ)

HIGH CORRELATION 

Distinct 3295
Distinct (%) 82.6%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 10926.308
Minimum 1.01
Maximum 49983.621
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 31.3 KiB

Quantile statistics

Minimum 1.01
5-th percentile 9
Q1 41
median 2366.77
Q3 19563
95-th percentile 44238.835
Maximum 49983.621
Range 49982.611
Interquartile range (IQR) 19522

Descriptive statistics

Standard deviation 15157.537
Coefficient of variation (CV) 1.3872515
Kurtosis 0.044927071
Mean 10926.308
Median Absolute Deviation (MAD) 2344.77
Skewness 1.2269281
Sum 43585042
Variance 2.2975094 × 10^8
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
Common Values
Value Count Frequency (%)
50 21 0.5%
12 20 0.5%
41 20 0.5%
47 19 0.5%
10 18 0.5%
33 18 0.5%
9 17 0.4%
4 17 0.4%
37 16 0.4%
46 16 0.4%
Other values (3285) 3807 95.4%

Minimum 10 values
Value Count Frequency (%)
1.00999999 1 < 0.1%
1.059999943 1 < 0.1%
1.070000052 1 < 0.1%
1.220000029 1 < 0.1%
1.460000038 1 < 0.1%
1.710000038 1 < 0.1%
1.720000029 1 < 0.1%
1.75 1 < 0.1%
1.789999962 1 < 0.1%
1.879999995 1 < 0.1%

Maximum 10 values
Value Count Frequency (%)
49983.62109 1 < 0.1%
49875 1 < 0.1%
49862 1 < 0.1%
49828 1 < 0.1%
49795.46094 1 < 0.1%
49781 1 < 0.1%
49710 1 < 0.1%
49696 1 < 0.1%
49691 1 < 0.1%
49676.44922 1 < 0.1%

trans_date
DateTime

Distinct 3987
Distinct (%) 99.9%
Missing 0
Missing (%) 0.0%
Memory size 31.3 KiB
Minimum 2023-08-10 05:56:41
Maximum 2023-09-09 05:52:02
Histogram with fixed size bins (bins=50)

trans_payment_method
Categorical

Distinct 2
Distinct (%) 0.1%
Missing 0
Missing (%) 0.0%
Memory size 31.3 KiB
Credit Card 1998
Debit Card 1991

Length

Max length11
Median length11
Mean length10.500877
Min length10

Characters and Unicode

Total characters41888
Distinct characters10
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowDebit Card
2nd rowCredit Card
3rd rowDebit Card
4th rowCredit Card
5th rowCredit Card

Common Values

ValueCountFrequency (%)
Credit Card 1998
50.1%
Debit Card 1991
49.9%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
card 3989
50.0%
credit 1998
25.0%
debit 1991
25.0%

Most occurring characters

ValueCountFrequency (%)
C 5987
14.3%
r 5987
14.3%
d 5987
14.3%
e 3989
9.5%
i 3989
9.5%
t 3989
9.5%
(space) 3989
9.5%
a 3989
9.5%
D 1991
 
4.8%
b 1991
 
4.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 29921
71.4%
Uppercase Letter 7978
 
19.0%
Space Separator 3989
 
9.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 5987
20.0%
d 5987
20.0%
e 3989
13.3%
i 3989
13.3%
t 3989
13.3%
a 3989
13.3%
b 1991
 
6.7%
Uppercase Letter
ValueCountFrequency (%)
C 5987
75.0%
D 1991
 
25.0%
Space Separator
ValueCountFrequency (%)
(space) 3989
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 37899
90.5%
Common 3989
 
9.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
C 5987
15.8%
r 5987
15.8%
d 5987
15.8%
e 3989
10.5%
i 3989
10.5%
t 3989
10.5%
a 3989
10.5%
D 1991
 
5.3%
b 1991
 
5.3%
Common
ValueCountFrequency (%)
(space) 3989
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 41888
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
C 5987
14.3%
r 5987
14.3%
d 5987
14.3%
e 3989
9.5%
i 3989
9.5%
t 3989
9.5%
(space) 3989
9.5%
a 3989
9.5%
D 1991
 
4.8%
b 1991
 
4.8%

trans_verify_method
Categorical

Distinct 2
Distinct (%) 0.1%
Missing 0
Missing (%) 0.0%
Memory size 31.3 KiB
PIN 2001
Biometric 1988

Length

Max length9
Median length3
Mean length5.9902231
Min length3

Characters and Unicode

Total characters23895
Distinct characters11
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowBiometric
2nd rowBiometric
3rd rowPIN
4th rowBiometric
5th rowBiometric

Common Values

ValueCountFrequency (%)
PIN 2001
50.2%
Biometric 1988
49.8%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
pin 2001
50.2%
biometric 1988
49.8%

Most occurring characters

ValueCountFrequency (%)
i 3976
16.6%
P 2001
8.4%
I 2001
8.4%
N 2001
8.4%
B 1988
8.3%
o 1988
8.3%
m 1988
8.3%
e 1988
8.3%
t 1988
8.3%
r 1988
8.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 15904
66.6%
Uppercase Letter 7991
33.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 3976
25.0%
o 1988
12.5%
m 1988
12.5%
e 1988
12.5%
t 1988
12.5%
r 1988
12.5%
c 1988
12.5%
Uppercase Letter
ValueCountFrequency (%)
P 2001
25.0%
I 2001
25.0%
N 2001
25.0%
B 1988
24.9%

Most occurring scripts

ValueCountFrequency (%)
Latin 23895
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 3976
16.6%
P 2001
8.4%
I 2001
8.4%
N 2001
8.4%
B 1988
8.3%
o 1988
8.3%
m 1988
8.3%
e 1988
8.3%
t 1988
8.3%
r 1988
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 23895
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 3976
16.6%
P 2001
8.4%
I 2001
8.4%
N 2001
8.4%
B 1988
8.3%
o 1988
8.3%
m 1988
8.3%
e 1988
8.3%
t 1988
8.3%
r 1988
8.3%

trans_status
Categorical

Distinct 3
Distinct (%) 0.1%
Missing 0
Missing (%) 0.0%
Memory size 31.3 KiB
Payment 1364
Transfer 1334
Purchase 1291

Length

Max length8
Median length8
Mean length7.6580597
Min length7

Characters and Unicode

Total characters30548
Distinct characters14
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowPayment
2nd rowPayment
3rd rowTransfer
4th rowPayment
5th rowTransfer

Common Values

ValueCountFrequency (%)
Payment 1364
34.2%
Transfer 1334
33.4%
Purchase 1291
32.4%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
payment 1364
34.2%
transfer 1334
33.4%
purchase 1291
32.4%

Most occurring characters

ValueCountFrequency (%)
a 3989
13.1%
e 3989
13.1%
r 3959
13.0%
n 2698
8.8%
P 2655
8.7%
s 2625
8.6%
y 1364
 
4.5%
m 1364
 
4.5%
t 1364
 
4.5%
T 1334
 
4.4%
Other values (4) 5207
17.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 26559
86.9%
Uppercase Letter 3989
 
13.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 3989
15.0%
e 3989
15.0%
r 3959
14.9%
n 2698
10.2%
s 2625
9.9%
y 1364
 
5.1%
m 1364
 
5.1%
t 1364
 
5.1%
f 1334
 
5.0%
u 1291
 
4.9%
Other values (2) 2582
9.7%
Uppercase Letter
ValueCountFrequency (%)
P 2655
66.6%
T 1334
33.4%

Most occurring scripts

ValueCountFrequency (%)
Latin 30548
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 3989
13.1%
e 3989
13.1%
r 3959
13.0%
n 2698
8.8%
P 2655
8.7%
s 2625
8.6%
y 1364
 
4.5%
m 1364
 
4.5%
t 1364
 
4.5%
T 1334
 
4.4%
Other values (4) 5207
17.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 30548
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 3989
13.1%
e 3989
13.1%
r 3959
13.0%
n 2698
8.8%
P 2655
8.7%
s 2625
8.6%
y 1364
 
4.5%
m 1364
 
4.5%
t 1364
 
4.5%
T 1334
 
4.4%
Other values (4) 5207
17.0%

fake
Boolean

HIGH CORRELATION 

Distinct 2
Distinct (%) 0.1%
Missing 0
Missing (%) 0.0%
Memory size 4.0 KiB
False 2512
True 1477
Value Count Frequency (%)
False 2512 63.0%
True 1477 37.0%

Interactions

[16 interaction plots; images not preserved in this text export]

Correlations

card_number customer_age mcc trans_amount customer_segment card_type customer_location merchant_name merchant_location trans_loc trans_cat trans_currency trans_payment_method trans_verify_method trans_status fake
card_number 1.000 -0.030 -0.018 0.002 0.866 0.866 0.867 0.074 0.035 0.867 0.044 0.086 0.000 0.117 0.099 0.000
customer_age -0.030 1.000 0.025 -0.016 0.092 0.103 0.083 0.016 0.028 0.083 0.038 0.025 0.042 0.000 0.032 0.000
mcc -0.018 0.025 1.000 0.016 0.000 0.016 0.000 0.000 0.027 0.000 0.025 0.000 0.000 0.000 0.000 0.004
trans_amount 0.002 -0.016 0.016 1.000 0.000 0.000 0.000 0.216 0.000 0.000 0.000 0.253 0.000 0.030 0.022 0.999
customer_segment 0.866 0.092 0.000 0.000 1.000 0.050 0.075 0.000 0.024 0.075 0.000 0.000 0.000 0.000 0.000 0.000
card_type 0.866 0.103 0.016 0.000 0.050 1.000 0.084 0.031 0.000 0.084 0.000 0.000 0.000 0.000 0.000 0.000
customer_location 0.867 0.083 0.000 0.000 0.075 0.084 1.000 0.010 0.000 1.000 0.000 0.016 0.015 0.000 0.035 0.000
merchant_name 0.074 0.016 0.000 0.216 0.000 0.031 0.010 1.000 0.000 0.010 0.000 0.176 0.000 0.035 0.020 0.619
merchant_location 0.035 0.028 0.027 0.000 0.024 0.000 0.000 0.000 1.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
trans_loc 0.867 0.083 0.000 0.000 0.075 0.084 1.000 0.010 0.000 1.000 0.000 0.016 0.015 0.000 0.035 0.000
trans_cat 0.044 0.038 0.025 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.000 0.000 0.000 0.000 0.000 0.000
trans_currency 0.086 0.025 0.000 0.253 0.000 0.000 0.016 0.176 0.000 0.016 0.000 1.000 0.000 0.000 0.000 0.611
trans_payment_method 0.000 0.042 0.000 0.000 0.000 0.000 0.015 0.000 0.000 0.015 0.000 0.000 1.000 0.000 0.000 0.000
trans_verify_method 0.117 0.000 0.000 0.030 0.000 0.000 0.000 0.035 0.000 0.000 0.000 0.000 0.000 1.000 0.024 0.000
trans_status 0.099 0.032 0.000 0.022 0.000 0.000 0.035 0.020 0.000 0.035 0.000 0.000 0.000 0.024 1.000 0.000
fake 0.000 0.000 0.004 0.999 0.000 0.000 0.000 0.619 0.000 0.000 0.000 0.611 0.000 0.000 0.000 1.000
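
The report does not say which association measure fills this matrix. ydata-profiling's default "auto" setting is documented as using Spearman's rank correlation for numeric pairs and Cramér's V for pairs that involve a categorical variable (with numeric columns discretized first). That would fit the pattern here, since negative entries occur only among card_number, customer_age, mcc, and trans_amount. Under that assumption, the sketch below shows an uncorrected Cramér's V on a contingency table. The function name and the toy tables are hypothetical, and the profiler may apply a bias correction that this version omits.

```python
import math


def cramers_v(table):
    """Uncorrected Cramér's V for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Pearson chi-square statistic against the independence expectation.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0


# Toy tables (made up): rows are categories of one variable, columns of another.
independent = [[5, 5], [5, 5]]
dependent = [[10, 0], [0, 10]]
print(cramers_v(independent), cramers_v(dependent))  # 0.0 1.0
```

A value of 1.000, as in the customer_location / trans_loc entry, indicates that one variable fully determines the other in this dataset.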

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

cardholder_namecard_numbercustomer_agecustomer_segmentcard_typecustomer_locationmerchant_namemerchant_locationtrans_idtrans_approval_codetrans_loctrans_cattrans_currencymcctrans_amounttrans_datetrans_payment_methodtrans_verify_methodtrans_statusfake
0Eric Scott69792809744449RetailOtherKolkataairtelTokyo, JapanQc9M36mAY1s6rxld9ML6TGCCKolkataTravelINR401331.7300002023-08-18 08:17:12Debit CardBiometricPaymentFalse
1Eric Scott69792809744449RetailOtherKolkataairtelCape Town, South AfricaHlrx3ReSC0IJRYzBy5XTIUMZKolkataEntertainmentINR40171358.2600102023-08-28 12:16:07Credit CardBiometricPaymentFalse
2Eric Scott69792809744449RetailOtherKolkatafake_merchant_8New York City, USAqScFTTzOUZD3M0rcg5EX8G40KolkataDiningCAD40077453.2700202023-09-03 00:55:30Debit CardPINTransferTrue
3Jason Jacobs35750049704655StudentVisaDelhirakutenChennai0cOkB7x1L6DB5WzKvuPWKEUMDelhiRetailINR40102781.9099122023-08-16 04:00:04Credit CardBiometricPaymentFalse
4Jason Jacobs35750049704655StudentVisaDelhiswiggyChennaiO7jfbyedchAWUScw30T70W7MDelhiGroceryINR401839.0000002023-09-02 15:01:38Credit CardBiometricTransferFalse
5Jason Jacobs35750049704655StudentVisaDelhichai talksDubai, UAEcg6Y9kHGe0ENh3a6Az4P6AO8DelhiRetailUSD401635631.9687502023-09-02 15:02:08Credit CardBiometricPaymentTrue
6Heather Lewis26319105875336StudentAmerican ExpressBangalorezomatoJaipurntqm2jxQsiMmYsQSp15MWOTXBangaloreRetailINR401710.0000002023-08-10 11:25:14Credit CardBiometricPaymentFalse
7Heather Lewis26319105875336StudentAmerican ExpressBangaloreamazon gift cardsCape Town, South AfricaTETj62XQ7zJWeTt6iA2M3YBCBangaloreRetailINR40021980.0000002023-08-15 14:43:18Credit CardPINTransferFalse
8Heather Lewis26319105875336StudentAmerican ExpressBangaloreinstamartSingapore, Singaporej3xnNiyK9WW7X0ZDn13RVJP4BangaloreRetailINR40132141.0000002023-08-23 13:14:19Credit CardPINPaymentFalse
9Heather Lewis26319105875336StudentAmerican ExpressBangalorerakutenNew York City, USABKDVY2pQ4lyAo61Q28NFMS2NBangaloreOtherINR400021.1200012023-09-01 09:19:27Debit CardBiometricTransferFalse
cardholder_namecard_numbercustomer_agecustomer_segmentcard_typecustomer_locationmerchant_namemerchant_locationtrans_idtrans_approval_codetrans_loctrans_cattrans_currencymcctrans_amounttrans_datetrans_payment_methodtrans_verify_methodtrans_statusfake
3979Kevin West80074648642368StudentAmerican ExpressAhmedabadrakutenAhmedabadZRfTRK6WDs2vC76vcU01LO9GAhmedabadEntertainmentINR401112.1100002023-08-13 03:22:30Debit CardBiometricTransferFalse
3980Kevin West80074648642368StudentAmerican ExpressAhmedabadairtelKolkatagRhP8HMoK9zbFxFje20ENJETAhmedabadGroceryINR40181296.3599852023-08-13 07:07:02Debit CardPINPaymentFalse
3981Kevin West80074648642368StudentAmerican ExpressAhmedabadrakutenLondon, UKTNfO996g9nXQRrVZilFAP42SAhmedabadOtherINR401645.4700012023-08-15 02:10:33Debit CardBiometricPurchaseFalse
3982Kevin West80074648642368StudentAmerican ExpressAhmedabadrakutenCape Town, South AfricaRV9VxUPx46TKGRR0hXGGP97JAhmedabadGroceryINR4017727.9600222023-09-08 21:57:31Credit CardBiometricPaymentFalse
3983Kevin West80074648642368StudentAmerican ExpressAhmedabadfake_merchant_1London, UKEWbU2kaxw2ayQNcON1VCYUXQAhmedabadUtilitiesJPY401638564.0000002023-09-08 21:57:56Debit CardPINPurchaseTrue
3984Jennifer Walters22036195839021PremiumVisaPunerakutenParis, FrancePbCYi8tK618jq6oLvR0SIT2YPuneDiningINR400938.0700002023-08-12 10:49:35Credit CardPINPurchaseFalse
3985Jennifer Walters22036195839021PremiumVisaPuneinstamartHyderabaduf8bySvRheHRdqbhMfBNNAJPPuneDiningINR4015227.8099982023-08-18 15:53:11Debit CardPINPurchaseFalse
3986Jennifer Walters22036195839021PremiumVisaPuneinstamartRio de Janeiro, BrazilvR2AoeftxNSunTyUel8FI4S6PuneRetailINR401330.0000002023-08-31 12:35:52Debit CardBiometricPurchaseFalse
3987Jennifer Walters22036195839021PremiumVisaPunerakutenHyderabadAOV2MnXgSNebfQard9LSDJX9PuneTravelINR401130.0000002023-09-02 15:03:02Credit CardBiometricPurchaseFalse
3988Jennifer Walters22036195839021PremiumVisaPuneswiggyChennaiNC9WbXNykaSJNywAN449ULH0PuneEntertainmentINR400037079.4804692023-09-02 16:51:07Debit CardPINPurchaseTrue